Polynomial-Delay and Polynomial-Space Algorithms for Mining Closed Sequences, Graphs, and Pictures in Accessible Set Systems
Abstract
In this paper, we study efficient closed pattern mining in the general framework of set systems, which are families of subsets ordered by set inclusion with a certain structure, proposed by Boley, Horváth, Poigné, and Wrobel (PKDD’07 and MLG’07). By modeling semi-structured data such as sequences, graphs, and pictures as set systems, we systematically study efficient mining of closed patterns. For a class of accessible set systems with a tree-like structure, we present an efficient depth-first search algorithm that finds all closed sets without duplicates, with polynomial delay and polynomial space in the total input size, using efficient oracles for the membership test and the closure computation of the pattern class. From these results, we show that closed pattern mining is efficiently solvable in both time and space for the following classes: convex hulls, picture patterns in 2-D planes, maximal bi-cliques, closed relational graphs, and closed patterns for rigid motifs with wildcards.
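To make the roles of the two oracles concrete, here is a minimal Python sketch of a depth-first enumeration of closed sets driven by a membership test and a closure operator. It is not the algorithm of the paper: it avoids duplicates by remembering every closed set it has seen, so it does not achieve the polynomial-space bound claimed in the abstract. All names (`enumerate_closed_sets`, `make_support_oracles`, `is_member`, `closure`) and the toy transaction data are assumptions chosen for illustration, with frequent itemsets standing in for the set system.

```python
from typing import Callable, FrozenSet, Iterator, List, Tuple

ItemSet = FrozenSet[str]


def enumerate_closed_sets(
    ground: ItemSet,
    is_member: Callable[[ItemSet], bool],
    closure: Callable[[ItemSet], ItemSet],
) -> Iterator[ItemSet]:
    """Yield each closed set once, using an explicit depth-first stack.

    Assumption: duplicates are filtered with a `seen` set, which can grow
    exponentially; the paper's algorithm avoids this, this sketch does not.
    """
    start = closure(frozenset())
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        yield current
        # Try to extend the current closed set by one element at a time.
        for element in sorted(ground - current):
            candidate = current | {element}
            if not is_member(candidate):  # membership oracle
                continue
            closed = closure(candidate)   # closure oracle
            if closed not in seen:
                seen.add(closed)
                stack.append(closed)


def make_support_oracles(
    transactions: List[ItemSet], minsup: int
) -> Tuple[ItemSet, Callable[[ItemSet], bool], Callable[[ItemSet], ItemSet]]:
    """Hypothetical oracles for frequent itemsets over a transaction list."""
    ground = frozenset().union(*transactions)

    def is_member(x: ItemSet) -> bool:
        # x belongs to the set system if at least `minsup` transactions contain it.
        return sum(1 for t in transactions if x <= t) >= minsup

    def closure(x: ItemSet) -> ItemSet:
        # Closure of x: the items shared by every transaction that contains x.
        covering = [t for t in transactions if x <= t]
        return frozenset.intersection(*covering) if covering else ground

    return ground, is_member, closure


transactions = [frozenset("abc"), frozenset("ab"), frozenset("abd"), frozenset("cd")]
ground, is_member, closure = make_support_oracles(transactions, minsup=2)
closed_sets = list(enumerate_closed_sets(ground, is_member, closure))
print(sorted("".join(sorted(s)) for s in closed_sets))  # ['', 'ab', 'c', 'd']
```

In this toy data set, `a` and `b` always occur together, so neither `{a}` nor `{b}` is closed and both extend to the same closed set `{a, b}`, which is reported only once.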
Similar resources
Efficient Closed Pattern Mining in Strongly Accessible Set Systems
Many problems in data mining can be viewed as a special case of the problem of enumerating the closed elements of an independence system with respect to some specific closure operator. Motivated by real-world applications, e.g., in track mining, we consider a generalization of this problem to strongly accessible set systems and arbitrary closure operators. For this more general problem setting,...
A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs
Problems such as closed frequent subset mining, itemset mining, and connected tree mining can be solved with polynomial delay. However, mining closed frequent connected subgraphs requires exponential time. In this paper, we present ECE-CloseSG, an algorithm for finding closed frequent unique edge label subgraphs. ECE-CloseSG uses a search space pruning and ap...
Tenacity and some other Parameters of Interval Graphs can be computed in polynomial time
In general, the computation of graph vulnerability parameters is NP-complete. In the past, some algorithms were introduced to prove that the toughness, scattering number, integrity, and weighted integrity of interval graphs can be computed in polynomial time. In this paper, two different vulnerability parameters of graphs, tenacity and rupture degree, are defined. In general, computing the tenacity o...
ON THE EDGE COVER POLYNOMIAL OF CERTAIN GRAPHS
Let $G$ be a simple graph of order $n$ and size $m$. An edge covering of $G$ is a set of edges such that every vertex of $G$ is incident to at least one edge of the set. The edge cover polynomial of $G$ is the polynomial $E(G,x)=\sum_{i=\rho(G)}^{m} e(G,i) x^{i}$, where $e(G,i)$ is the number of edge coverings of $G$ of size $i$, and $\rho(G)$ is the edge covering number of $G$. In this paper we stud...
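The definition above can be checked on small graphs by brute force. The following Python sketch counts, for each size $i$, the edge subsets that cover every vertex, which gives the coefficients $e(G,i)$ directly. The function name `edge_cover_coefficients` and the encoding of vertices as integers `0..n-1` are assumptions made for this illustration; the approach is exponential in the number of edges and is only meant for tiny examples.

```python
from itertools import combinations
from typing import List, Tuple


def edge_cover_coefficients(num_vertices: int, edges: List[Tuple[int, int]]) -> List[int]:
    """Return [e(G,0), e(G,1), ..., e(G,m)] by enumerating all edge subsets.

    Assumption: vertices are labeled 0..num_vertices-1 and the graph is simple.
    """
    vertices = set(range(num_vertices))
    coefficients = [0] * (len(edges) + 1)
    for size in range(len(edges) + 1):
        for subset in combinations(edges, size):
            # A subset is an edge covering if its endpoints include every vertex.
            covered = {v for edge in subset for v in edge}
            if covered == vertices:
                coefficients[size] += 1
    return coefficients


# Triangle: any two edges cover all three vertices, and so do all three edges.
triangle = edge_cover_coefficients(3, [(0, 1), (1, 2), (0, 2)])
print(triangle)  # [0, 0, 3, 1], i.e. E(G, x) = 3x^2 + x^3
```

The index of the first nonzero coefficient is the edge covering number $\rho(G)$, which is 2 for the triangle.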
Time and Space Efficient Discovery of Maximal Geometric Graphs
A geometric graph is a labeled graph whose vertices are points in the 2D plane, with an isomorphism invariant under geometric transformations such as translation, rotation, and scaling. While Kuramochi and Karypis (ICDM 2002) extensively studied the frequent pattern mining problem for geometric subgraphs, maximal graph mining has not been considered so far. In this paper, we study the maximal...
Publication date: 2009